Mining non-lattice subgraphs for detecting missing hierarchical relations and concepts in SNOMED CT
Abstract
Objective: Quality assurance of large ontological systems such as SNOMED CT is an indispensable part of the terminology management lifecycle. We introduce a hybrid structural-lexical method for scalable and systematic discovery of missing hierarchical relations and concepts in SNOMED CT.

Material and Methods: All non-lattice subgraphs (the structural part) in SNOMED CT are exhaustively extracted using a scalable MapReduce algorithm. Four lexical patterns (the lexical part) are identified among the extracted non-lattice subgraphs. Non-lattice subgraphs exhibiting such lexical patterns are often indicative of missing hierarchical relations or concepts. Each lexical pattern is associated with a potential specific type of error.

Results: Applying the structural-lexical method to SNOMED CT (September 2015 US edition), we found 6801 non-lattice subgraphs that matched these lexical patterns, of which 2046 were amenable to visual inspection. We evaluated a random sample of 100 small subgraphs, of which 59 were reviewed in detail by domain experts. All the subgraphs reviewed contained errors confirmed by the experts. The most frequent type of error was missing is-a relations due to incomplete or inconsistent modeling of the concepts.

Conclusions: Our hybrid structural-lexical method is innovative and proved effective not only in detecting errors in SNOMED CT, but also in suggesting remediation for these errors.
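As a rough illustration of the structural part, the sketch below flags "non-lattice pairs" in a toy is-a hierarchy. It assumes the usual order-theoretic reading, in which two concepts form a non-lattice pair when they share more than one maximal common descendant; the abstract does not spell out this definition, so treat it as an assumption. The concept names `A` to `E` and all function names are hypothetical, and the brute-force pairwise search stands in for, and is not, the MapReduce algorithm mentioned above.

```python
from itertools import combinations


def down_set(children, node):
    """Return node plus all of its descendants in a DAG given as {parent: [children]}."""
    seen = {node}
    stack = [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def maximal_common_descendants(children, a, b):
    """Common descendants of a and b that lie below no other common descendant."""
    common = down_set(children, a) & down_set(children, b)
    return {
        c for c in common
        if not any(c in down_set(children, other) for other in common if other != c)
    }


def non_lattice_pairs(children):
    """Map each pair with more than one maximal common descendant to those descendants."""
    nodes = set(children) | {c for cs in children.values() for c in cs}
    result = {}
    for a, b in combinations(sorted(nodes), 2):
        mcd = maximal_common_descendants(children, a, b)
        if len(mcd) > 1:
            result[(a, b)] = mcd
    return result


# Toy hierarchy (hypothetical): A and B both subsume C and D directly.
toy = {"A": ["C", "D"], "B": ["C", "D"]}

# Same hierarchy after inserting an intermediate concept E below A and B.
fixed = {"A": ["E"], "B": ["E"], "E": ["C", "D"]}

print(non_lattice_pairs(toy))
print(non_lattice_pairs(fixed))
```

In the toy hierarchy, `A` and `B` share two maximal common descendants, so the pair is flagged. Inserting the intermediate concept `E` removes the violation, which loosely mirrors how a non-lattice subgraph can point to a missing concept or a missing is-a relation.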
Similar resources
Auditing SNOMED CT hierarchical relations based on lexical features of concepts in non-lattice subgraphs
OBJECTIVE We introduce a structural-lexical approach for auditing SNOMED CT using a combination of non-lattice subgraphs of the underlying hierarchical relations and enriched lexical attributes of fully specified concept names. Our goal is to develop a scalable and effective approach that automatically identifies missing hierarchical IS-A relations. METHODS Our approach involves 3 stages. In ...
Detecting Misaligned and Missing Concepts in SNOMED CT using Structural and Lexical Patterns
Objective: Quality assurance of large ontological systems such as SNOMED CT is an indispensable part of the terminology management lifecycle. We introduce a hybrid structural-lexical method for scalable and systematic discovery of novel anomalies in SNOMED CT. The structural component is based on shared isa relations to other concepts. The lexical component leverages shared words in description...
Identifying Missing Hierarchical Relations in SNOMED CT from Logical Definitions Based on the Lexical Features of Concept Names
Objectives. To identify missing hierarchical relations in SNOMED CT from logical definitions based on the lexical features of concept names. Methods. We first create logical definitions from the lexical features of concept names, which we represent in OWL EL. We infer hierarchical (subClassOf) relations among these concepts using the ELK reasoner. Finally, we compare the hierarchy obtained from...
Identifying Potentially Missing Hierarchical Relations in SNOMED CT based on Lexical Features - Impact of Synonyms and Lexico-syntactic Constraints
Introduction The quality assurance of large bio-ontologies is extremely critical for their effective and continued use and is an active area of research1. For example, recent investigations highlighted issues in the hierarchical structure of SNOMED CT and its detrimental effects on biomedical applications2. Previous work by one of the authors3 established a method to identify potentially missin...
NEO: Systematic Non-Lattice Embedding of Ontologies for Comparing the Subsumption Relationship in SNOMED CT and in FMA Using MapReduce
A structural disparity of the subsumption relationship between FMA and SNOMED CT's Body Structure sub-hierarchy is that while the is-a relation in FMA has a tree structure, the corresponding relation in Body Structure is not even a lattice. This paper introduces a method called NEO, for non-lattice embedding of FMA fragments into the Body Structure sub-hierarchy to understand (1) this structura...
Journal title:
- Journal of the American Medical Informatics Association : JAMIA
Volume 24, Issue 4
Publication date: 2017